A Fast and Sensitive Algorithm for Aligning Ests to the Human Genome

نویسندگان

  • Jun Ogasawara
  • Shinichi Morishita
چکیده

There is a pressing need to align the growing set of expressed sequence tags (ESTs) with the newly sequenced human genome. However, the problem is complicated by the exon/intron structure of eukaryotic genes misread nucleotides in ESTs, and the millions of repetitive sequences in genomic sequences. To solve this problem, algorithms that use dynamic programming have been proposed. In reality, however, these algorithms require an enormous amount of processing time. In an effort to improve the computational efficiency of these classical DP algorithms, we developed software that fully utilizes lookup-tables to detect the start- and endpoints of an EST within a given DNA sequence efficiently, and subsequently promptly identify exons and introns. In addition, the locations of all splice sites must be calculated correctly with high sensitivity and accuracy, while retaining high computational efficiency. This goal is hard to accomplish in practice, due to misread nucleotides in ESTs and repetitive sequences in the genome. Nevertheless, we present two heuristics that effectively settle this issue. Experimental results confirm that our technique improves the overall computation time by orders of magnitude compared with common tools, such as SIM4 and BLAT, and simultaneously attains high sensitivity and accuracy against a clean dataset of documented genes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast and Sensitive Algorithm for Aligning ESTs to Human Genome

There is a pressing need to align growing set of expressed sequence tags (ESTs) to newly sequenced human genome. The problem is, however, complicated by the exon/intron structure of eucaryotic genes, misread nucleotides in ESTs, and millions of repeptive sequences in genomic sequences. Indeed, to solve this, algorithms that use dynamic programming have been proposed, but in reality, these algor...

متن کامل

P-215: Discovery of A Novel APA Variant of A Human Potential Gene Based on Expressed Sequenced Tags Analysis

Background: Expressed sequence tags (ESTs) are sequences of cDNA fragments prepared from different tissue sources. There are over one million of these sequences in the publicly available database, and these sequences are believed to represent more than half of all human genes. The ESTs belong to different cDNA libraries, was prepared from one particular cell type, organ, or tumor. Therefore, th...

متن کامل

A Reliable Routing Algorithm for Delay Sensitive Data in Body Area Networks

Wireless body Area networks (WBANs) include a number of sensor nodes placed inside or on the human body to improve patient health and quality of life. Ensuring the transfer and receipt of data in sensitive data is a very important issue. Routing algorithms should support a variety of service quality such as reliability and delay in sending and receiving data. Loss of information or excessive da...

متن کامل

Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model

Single Nucleotide Polymorphisms (SNPs) are the most usual form of polymorphism in human genome.Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. Theparticular pattern of these common variations forms a block-like structure on human genome. In this work,we develop a new method based on the Perfect Phylogeny Model to identify haplo...

متن کامل

Designing a model for aligning human resource strategies with the organization’s strategy in the field of succession management

Succession management is one of the human resource management strategies and requires a strategic approach for the long-term future of the organization. Due to the multiple benefits and synergies that align human resource strategies with the organization’s strategy, the present study aimed to design a model of alignment of human resource strategies with the organization's strategy in the field ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Journal of bioinformatics and computational biology

دوره 1 2  شماره 

صفحات  -

تاریخ انتشار 2003